Biology Methods and Protocols

Oxford University Press (OUP)

Preprints posted in the last 7 days, ranked by how well they match Biology Methods and Protocols's content profile, based on 53 papers previously published here. The average preprint has a 0.08% match score for this journal, so anything above that is already an above-average fit.

1
Explainable, Lightweight Deep Learning for Colorectal Cancer Microsatellite Instability Screening in Low-Resource Settings

Adegbosin, O. T.; Patel, H.

2026-04-20 oncology 10.64898/2026.04.18.26350809 medRxiv
Top 0.1%
18.4%

Background: Microsatellite stability status determination is important for prognostication and therapeutic decision making in colorectal cancer management, but the conventional methods for this assessment are not readily available, especially in low- and middle-income countries. Deep learning (DL) models have been proposed for addressing this problem; however, potential computational cost due to model complexity and inadequate explainability may limit their adoption in low-resource settings. This study explored the potential of explainable lightweight models for detection of microsatellite instability in colorectal cancer. Methods: DL models were trained using a public dataset of colorectal cancer histology images and then used to classify a set of test images into one of two classes: microsatellite instability or microsatellite stability. The models were compared for efficiency. Gradient-weighted class activation mapping (Grad-CAM) was used to interpret the models' decision making. Results: The simpler convolutional neural network (CNN) trained from scratch had modest performance (accuracy=0.757, area under receiver-operating characteristic curve [AUROC]=0.840). With an attention mechanism added, these values increased, but specificity and sensitivity decreased. Pretrained models performed better than the ones trained from scratch, and EfficientNet_B0 had the best balance of high performance and low computational requirements (accuracy=0.936, AUROC=0.990, negative predictive value=0.923, specificity=0.953, 4,010,000 trainable parameters, 0.38 gigaFLOPs). However, a simple CNN model with attention mechanism had the best interpretability based on Grad-CAM. Conclusion: This study demonstrated that DL models that are lightweight when compared to previously proposed ones can be useful for colorectal cancer microsatellite instability screening in resource-limited settings while balancing performance and computational efficiency.

2
Assessing medication-related burden and medication adherence among older patients from Central Nepal: A machine learning approach

Giri, R.; Agrawal, R.; Lamichhane, S. R.; Barma, S.; Mahatara, R.

2026-04-23 geriatric medicine 10.64898/2026.04.22.26351447 medRxiv
Top 0.1%
8.9%

We are pleased to submit our original article entitled "Assessing medication-related burden and medication adherence among older patients from Central Nepal: A machine learning approach" for consideration in your esteemed journal. In this paper, we assessed medication burden using the validated Living with Medicines Questionnaire (LMQ-3) and medication adherence using the Adherence to Refills and Medications Scale (ARMS). We analysed our results using a machine learning approach instead of a traditional statistical approach to identify the complex factors influencing both. Six ML models (Ordinary Least Squares, LightGBM, Random Forest, XGBoost, SVM, and penalized linear regression) were employed to predict ARMS and LMQ scores using various socio-demographic, clinical and medication-related predictive features. Model explainability was provided through SHAP (SHapley Additive exPlanations). Our study identified moderate medication burden with moderate non-adherence among older adults. Requiring assistance for medication and polypharmacy were the strongest drivers of medication burden and non-adherence. The high predictive accuracy of the ML models suggests appropriate clinical interventions, such as deprescribing, to cope with the highly prevalent medication burden and non-adherence among older adults in Nepal.

3
Comparison of foundation models and transfer learning strategies for diabetic retinopathy classification

Li, L. Y.; Lebiecka-Johansen, B.; Byberg, S.; Thambawita, V.; Hulman, A.

2026-04-20 health informatics 10.64898/2026.04.17.26351092 medRxiv
Top 0.1%
6.8%

Diabetic retinopathy (DR) is a leading cause of vision impairment, requiring accurate and scalable diagnostic tools. Foundation models are increasingly applied to clinical imaging, but concerns remain about their calibration. We evaluated DINOv3, RETFound, and VisionFM for DR classification using different transfer learning strategies in BRSET (n = 16,266) and mBRSET (n = 5,164). Models achieved high discrimination in binary classification (normal vs retinopathy) in BRSET (AUROC 0.90-0.98), with DINOv3 achieving the best under full fine-tuning (AUROC 0.98 [95% CI: 0.97-0.99]). External validation on mBRSET showed decreased performance for all models regardless of the fine-tuning strategy (AUROC 0.70-0.85), though fine-tuning improved performance. Foundation models achieved strong discrimination but poor calibration, generally overestimating DR risk. While the generalist model, DINOv3, benefited from deeper fine-tuning, miscalibration remained evident. These findings underscore the need to improve calibration and the comprehensive evaluation of foundation models, which are essential in clinical settings. Author summary: Artificial intelligence is increasingly being used to detect eye diseases such as diabetic retinopathy from retinal images. Recent advances have introduced "foundation models," which are trained on large datasets and can be adapted to new tasks. We aimed to evaluate how well these models perform in a clinical prediction context, with a focus not only on accuracy but also on how reliably they estimate disease risk. In this study, we compared different types of foundation models using two independent datasets from Brazil. We found that while these models were generally good at distinguishing between healthy and diseased eyes, their predicted risks were often poorly calibrated. In other words, the estimated probabilities did not consistently reflect the true likelihood of disease.
We also examined whether adapting the models to the target population could improve performance. Although this approach led to improvements, calibration issues remained. However, post-training correction improved the agreement between predicted risks and observed outcomes. Our findings highlight an important gap between model performance and clinical usefulness. We suggest that improving the reliability of risk estimates is essential before such systems can be safely used in healthcare.

4
Analysis and Mitigation of Equipment-induced Shortcuts in AI Models for Laparoscopic Cholecystectomy

Protserov, S.; Repalo, A.; Mashouri, P.; Hunter, J.; Masino, C.; Madani, A.; Brudno, M.

2026-04-24 surgery 10.64898/2026.04.22.26351545 medRxiv
Top 0.1%
6.7%

Machine learning models have seen a lot of success in the medical image segmentation domain. However, one of the challenges that they face is confounders or shortcuts: spurious correlations or biases in the training data that affect the resulting models. One example of such confounders for surgical machine learning is the setup of surgical equipment, including tools and lighting. Using the task of identification of safe and dangerous zones of dissection in laparoscopic cholecystectomy images and videos as a use-case, we inspect two equipment-induced biases: the presence of surgical tools in the field of view and the position of lighting. We propose methods for evaluating the severity of these biases and augmentation-based methods for mitigating them. We show that our tool bias mitigations improve the models' consistency under tool movements by 9 percentage points in the most inconsistent cases, and by 4 percentage points on average. Our lighting bias mitigations help reduce the fraction of true dangerous zone pixels that may be predicted as safe under light changes from 5% to 1.5%, without compromising segmentation quality.

5
Comparing prognostic performance and reasoning between large language models and physicians

Gjertsen, M.; Yoon, W.; Afshar, M.; Temte, B.; Leding, B.; Halliday, S.; Bradley, K.; Kim, J.; Mitchell, J.; Sanders, A. K.; Croxford, E. L.; Caskey, J.; Churpek, M. M.; Mayampurath, A.; Gao, Y.; Miller, T.; Kruser, J. M.

2026-04-25 intensive care and critical care medicine 10.64898/2026.04.17.26350898 medRxiv
Top 0.1%
6.2%

Importance: Physicians routinely prognosticate to guide care delivery and shared decision making, particularly when caring for patients with critical illnesses. Yet, these physician estimates are prone to inaccuracy and uncertainty. Artificial intelligence, including large language models (LLMs), shows promise in supporting or improving this prognostication. However, the performance of contemporary LLMs in prognosticating for the heterogeneous population of critically ill patients remains poorly understood. Objective: To characterize and compare the performance of LLMs and physicians when predicting 6-month mortality for hospitalized adults who survived critical illness. Design: Embedded mixed methods study with elicitation and comparison of prognostic estimates and reasoning from LLMs and practicing physicians. Setting: The publicly available, deidentified Medical Information Mart for Intensive Care (MIMIC)-IV v2.2 dataset. Participants: We randomly selected 100 hospitalizations of adult survivors of critical illness. Four contemporary LLMs (OpenAI GPT-4o, o3- and o4-mini, and DeepSeek-R1) and 7 physicians provided independent prognostic estimates for each case (1,100 total estimates; 400 LLM and 700 physician). Main outcomes and measures: For each case, LLMs and physicians used the hospital discharge summary and demographics to predict 6-month mortality (yes/no) and provide their reasoning (free text). We assessed prognostic performance using accuracy, sensitivity, and specificity, and used inductive, qualitative content analysis to characterize the reasoning. Results: Mean physician accuracy for predicting mortality was 70.1% (95% CI 63.7-76.4%), with sensitivity of 59.7% (95% CI 50.6-68.8%) and specificity of 80.6% (95% CI 71.7-88.2%). The top-performing LLM (OpenAI o4-mini) accuracy was 78.0% (95% CI 70.0-86.0%), with sensitivity of 80.0% (95% CI 67.4-90.2%) and specificity of 76.0% (95% CI 63.3-88.0%).
The difference between mean physician and top-performing LLM accuracy was not statistically significant (p = 0.5). Qualitative analysis revealed similar patterns in LLM and physician expressed reasoning, except that physicians regularly and explicitly reported uncertainty while LLMs did not. Conclusion and Relevance: In this study, LLMs and physicians achieved comparable, moderate performance in predicting 6-month mortality after critical illness, with similar patterns in expressed reasoning. Our findings suggest LLMs could be used to support prognostication in clinical practice but also raise safety concerns due to the lack of LLM uncertainty expression.

6
Development and Evaluation of iSupport-Malaysia: A Multimedia Web-Based Psychoeducational Intervention for Dementia Caregivers

Loh, K. J.; Lee, W. L.; Ng, A. L. O.; Chung, F. F. L.; Renganathan, E.

2026-04-21 geriatric medicine 10.64898/2026.04.14.26350743 medRxiv
Top 0.2%
4.3%

Background: Caring for people with dementia can impose a considerable psychological burden on caregivers, yet access to caregiver support in Malaysia remains limited. The World Health Organization's iSupport for Dementia program provides dementia education via a textual, e-learning format. However, a culturally adapted Malaysian version has not been available. Objective: This study aimed to develop and gather user feedback on a culturally adapted, multimedia version of iSupport tailored for Malaysia (iSupport-Malaysia). Methods: Guided by a four-phase cultural adaptation framework, the generic iSupport content was translated into Bahasa Malaysia, adapted to local customs, and transformed into multimedia lessons on an e-learning platform. A mixed-methods design was used to explore user perceptions and evaluate usability through four homogeneous focus group discussions and 15 individual usability test sessions with informal caregivers (FG: n=9; UT: n=9) and healthcare professionals (FG: n=11; UT: n=6). Focus groups examined aesthetics, ease of use, clarity, cultural relevance, comprehensiveness, and satisfaction. Usability testing involved Think Aloud tasks, post-test questionnaires, and brief interviews. Qualitative data was analysed thematically, and descriptive statistics summarised usability performance. Results: iSupport-Malaysia demonstrated good usability (M=74.3±18.0), with most tasks completed without assistance. Strengths included interactive learning activities, peer discussion features, and flexible self-paced learning. Content was viewed as culturally appropriate, credible, and useful. Suggested improvements included enhancing visual aesthetics, shortening videos, refining quizzes, and increasing practical relevance. Conclusion: User insights indicate that iSupport-Malaysia is usable and culturally appropriate.
These findings will inform refinement of the platform prior to the pilot feasibility study and provide recommendations for future multimedia-based caregiver interventions.

7
Modality Fusion of MRI and Clinical Data for Glioma Tumour Grading

Kheirbakhsh, R.; Mathur, P.; Lawlor, A.

2026-04-22 health informatics 10.64898/2026.04.20.26351308 medRxiv
Top 0.3%
3.6%

Multimodal machine learning leverages complementary information from diverse data sources and has shown strong promise in medical imaging, where multimodal data is critical for clinical decision making. In glioma grading, integrating MRI modalities with clinical data can improve diagnostic accuracy, yet systematic comparisons of fusion strategies remain limited. This study evaluates early, intermediate, and late fusion approaches, addressing the question: How does the inclusion of clinical data alongside MRI modalities influence grading performance? To assess modality contributions, we design adaptable fusion layers and employ interpretability techniques, including attention-based analysis. Our results show that incorporating clinical data consistently outperforms unimodal and MRI-only baselines, with intermediate fusion yielding the most reliable gains. Beyond accuracy, the framework reveals how MRI and clinical features jointly shape predictions, underscoring the importance of both fusion design and interpretability for clinical adoption.

8
Vision Language Model for Coronary Angiogram Analysis and Report Generation: Development and Evaluation Study

Jiang, Q.; Ke, Y.; Sinisterra, L. G.; Elangovan, K.; Li, Z.; Yeo, K. K.; Jonathan, Y.; Ting, D. S. W.

2026-04-21 cardiovascular medicine 10.64898/2026.04.19.26351241 medRxiv
Top 0.4%
3.1%

Coronary artery disease is a leading cause of morbidity and mortality. Invasive coronary angiography is currently the gold standard in disease diagnosis. Several studies have attempted to use artificial intelligence (AI) to automate its interpretation with varying levels of success. However, most existing studies cannot generate detailed angiographic reports beyond simple classification or segmentation. This study aims to fine-tune and evaluate the performance of a Vision-Language Model (VLM) in coronary angiogram interpretation and report generation. Using twenty thousand angiogram keyframes of 1987 patients collated across four unique datasets, we fine-tuned the InternVL2-4B model with Low-Rank Adaptor weights so that it can perform stenosis detection, anatomy labelling, and report generation. The fine-tuned VLM achieved a precision of 0.56, recall of 0.64, and F1-score of 0.60 for stenosis detection. In anatomy segmentation, it attained a weighted precision of 0.50, recall of 0.43, and F1-score of 0.46, with higher scores in major vessel segments. Report generation integrating multiple angiographic projection views yielded an accuracy of 0.42, negative predictive value of 0.58 and specificity of 0.52. This study demonstrates the potential of using VLM to streamline angiogram interpretation to rapidly provide actionable information to guide management, support care in resource-limited settings, and audit the appropriateness of coronary interventions. Author summary: Coronary artery disease carries a heavy disease burden worldwide, and the coronary angiogram is the gold standard imaging for its diagnosis. Interpreting these complex images and producing clinical reports require significant expertise and time. In this study, we fine-tuned and investigated an open-source VLM, InternVL2-4B, to interpret and report coronary angiogram images in key tasks including stenosis detection, anatomy identification, as well as full report generation.
We also referenced the fine-tuned InternVL2-4B against a state-of-the-art segmentation model, YOLOv8x, which was evaluated on the same test sets. We examined how machine learning metrics like the intersection over union score may not fully capture the clinical accuracy of model predictions and discussed the limitations of relying solely on these metrics for evaluating clinical AI systems. Although the model has not yet achieved expert-level interpretation, our results demonstrate the potential and feasibility of automating the reporting of coronary angiograms. Such systems could potentially assist cardiologists by improving reporting efficiency, highlighting lesions that may require review, and enabling automated calculations of clinical scores such as the SYNTAX score.

9
AI-Based Clinical Decision Support Systems for Secondary Caries on Bitewings: A Multi-Algorithm Comparison

Chaves, E. T.; Teunis, J. T.; Digmayer Romero, V. H.; van Nistelrooij, N.; Vinayahalingam, S.; Sezen-Hulsmans, D.; Mendes, F. M.; Huysmans, M.-C.; Cenci, M. S.; Lima, G. d. S.

2026-04-25 dentistry and oral medicine 10.64898/2026.04.17.26350883 medRxiv
Top 0.5%
2.2%

Background: Radiographic detection of caries lesions adjacent to restorations is challenging due to limitations of two-dimensional imaging and difficulties distinguishing true lesions from restorative or anatomical radiolucencies. Artificial intelligence (AI)-based clinical decision support systems (CDSSs) have been introduced to assist radiographic interpretation; however, different AI tools may yield variable diagnostic outputs, and their comparative performance remains unclear. Objective: To compare the diagnostic performance of commercial and experimental AI algorithms for detecting secondary caries lesions on bitewings. Methods: This cross-sectional diagnostic accuracy study included 200 anonymized bitewings comprising 885 restored tooth surfaces. A consensus group reference standard identified all surfaces with a caries lesion and classified each lesion by type (primary/secondary) and depth (enamel-only/dentin-involved). Five commercial (Second Opinion, CranioCatch, Diagnocat, DIO Inteligencia, and Align X-ray Insights) and three experimental (Mask R-CNN-based and Mask DINO-based) systems were tested. Diagnostic performance was expressed through sensitivity, specificity, and overall accuracy (95% CI). Comparisons used generalized estimating equations, adjusted for clustered data. Results: Specificity was high across all systems (0.957-0.986), confirming accurate recognition of non-carious surfaces, whereas sensitivity was moderate (0.327-0.487), reflecting frequent missed detections of enamel and dentin lesions. Accuracy ranged from 0.882 to 0.917, with no significant differences among models (p >= 0.05). Confounding factors, such as radiographic overlapping, marginal restoration defects, and cervical artifacts, were the main sources of misclassification. Conclusions: AI algorithms, regardless of architecture or commercial status, showed similar diagnostic capabilities and a conservative detection profile, favoring specificity over sensitivity. 
Improvements in dataset diversity, labeling precision, and explainability may further enhance reliability for secondary caries detection. Clinical Significance: AI-based CDSSs assist clinicians by providing consistent detection. Their high specificity is particularly valuable in minimizing unnecessary invasive treatments (overtreatment), though they should be used as adjuncts rather than a replacement for expert judgment.

10
Comparing Gleason Pattern 4 Measurement Approaches on Prostate Biopsy Using Machine Learning: A Proof-of-Principle Study

Buzoianu, M. M.; Yu, R.; Assel, M.; Bozkurt, A.; Aghdam, H.; Fine, S.; Vickers, A.

2026-04-24 oncology 10.64898/2026.04.23.26351615 medRxiv
Top 0.5%
2.2%

Objective: To demonstrate the proof of principle that machine learning (ML) can be used to quantify Gleason Pattern (GP) 4 on digitized biopsy slides using multiple measurement approaches, allowing direct comparison of their prognostic performance. Methods: We assembled a convenience sample of 726 patients with grade group 2-4 prostate cancer on systematic biopsy who underwent radical prostatectomy between 2014 and 2023. Digitized biopsy slides were analyzed using a machine-learning algorithm (PAIGE-AI) to quantify GP4 using multiple measurement approaches, particularly with respect to how gaps between cancer foci (interfocal stroma) were handled. GP4 extent was quantified using linear measurements or a pixel-based area metric. Discrimination of each GP4 quantification approach, along with Grade Group (GG), was assessed for adverse radical prostatectomy pathology and biochemical recurrence (BCR). Results: We identified 15 different quantification approaches and observed differences between their discrimination. The highest discrimination was in the pixel-counting method (AUC 0.648). GP4 quantification outperformed GG for predicting adverse pathology (AUC 0.627 vs 0.608). Amount of GP3 was non-predictive once GP4 was known. These findings were consistent for BCR. Conclusions: We were able to measure slides using 15 distinct measurement approaches and replicated prior findings using ML to quantify GP4. Our findings support the use of ML as a research tool to compare different GP4 quantification approaches. We intend to use our method on larger cohorts to determine which measurement approach best predicts oncologic outcome.

11
Large language models and retrieval augmented generation for complex clinical codelists: evaluating performance and assessing failure modes

Matthewman, J.; Denaxas, S.; Langan, S.; Painter, J. L.; Bate, A.

2026-04-24 health informatics 10.64898/2026.04.23.26351098 medRxiv
Top 0.6%
2.0%

Objectives: Large language models (LLMs) have shown promise in creating clinical codelists for research purposes, a time-consuming task requiring expert domain knowledge. Here, we evaluate the performance and assess failure modes of a retrieval augmented generation (RAG) approach to creating clinical codelists for the large and complex medical terminology used by the Clinical Practice Research Datalink (CPRD). Materials & Methods: We set up a RAG system using a database of word embeddings of the medical terminology that we created using a general-purpose word embedding model (gemini-embedding). We developed 7 reference codelists presenting different challenges and tagged required and optional codes. We ran 168 evaluations (7 codelists, 2 different database subsets, 4 models, 3 epochs each). Scoring was based on the omission of required codes, and inclusion of irrelevant codes. We used model-grading (i.e., grading by another LLM with the reference codelists provided as context) to evaluate the output codelists (a score of 0% being all incorrect and 100% being all correct). Results: We saw varying accuracy across models and codelists, with Gemini 3 Pro (Score 43%) generally performing better than Claude Sonnet 4.6 (36%), Gemini 3 Flash, and OpenAI GPT 5.2 performing worst (14%). Models performed better with shorter target codelists (e.g., Eosinophilic esophagitis with four codes, and Hidradenitis suppurativa with 14 codes). For example, all models consistently failed to produce a complete Wrist fracture codelist (with 214 required codes). We further present evaluation summaries, and failure mode evaluations produced by parsing LLM chat logs. Discussion: Besides demonstrating that a single-shot RAG approach is currently not suitable for codelist generation, we demonstrate failure modes including hallucinations, retrieval failures and generation failures where retrieved codes are not used. 
Conclusions: Our findings suggest that while RAG systems using current frontier LLMs may create correct clinical codelists in some cases, they still struggle with large and complex terminologies and codelists with a large number of codes. The failure modes we highlight can inform the creation of future workflows to avoid failures.

12
Attention-Guided CNN Ensemble for Binary Classification of High-Grade and Low-Grade Serous Ovarian Carcinoma from Histopathological WSI Patches

Rani, A.; Mishra, S.

2026-04-22 oncology 10.64898/2026.04.21.26351441 medRxiv
Top 0.9%
1.7%

Accurate histopathological differentiation between High-Grade Serous Carcinoma (HGSC) and Low-Grade Serous Carcinoma (LGSC) remains a critical yet challenging aspect of ovarian cancer diagnosis due to their similar morphology and different clinical outcomes. This study presents a deep learning framework that uses custom attention mechanisms, including the Convolutional Block Attention Module (CBAM), Squeeze-and-Excitation (SE) blocks, and a Differential Attention module within five CNN architectures for automated binary classification of ovarian cancer subtypes from H&E WSI patches. Although individual models achieved higher accuracy, the ensemble stacking framework with a shallow MLP meta-learner delivered the best overall performance, with a ROC-AUC of 0.9211, an accuracy of 0.85, and F1-scores of 0.84 and 0.85 across both subtypes. These findings demonstrate that attention-guided feature recalibration combined with ensemble stacking provides robust and clinically interpretable discrimination of ovarian carcinoma subtypes.

13
Identifying SARS-CoV-2 Lineages that Share the Same Relative Effective Reproduction Numbers

Musonda, R.; Ito, K.; Omori, R.; Ito, K.

2026-04-24 infectious diseases 10.64898/2026.04.22.26351531 medRxiv
Top 1.0%
1.7%

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has continuously evolved since its emergence in the human population in 2019. As of 1st August 2025, more than 1,700 Omicron subvariants have been designated by the Pango nomenclature system. The Pango nomenclature system designates a new lineage based on genetic and epidemiological information of SARS-CoV-2 strains. However, there is a possibility that strains that have similar genetic backgrounds and the same phenotype are given different Pango lineage names. In this paper, we propose a new algorithm, called FindPart-w, which can identify groups of viral lineages that share the same relative effective reproduction numbers. We introduced a new lineage replacement model, called the constrained RelRe model, which constrains groups of lineages to have the same relative effective reproduction numbers. The FindPart-w algorithm searches the equality constraints that minimise the Akaike Information Criterion of constrained RelRe models. Using hypothetical observation count data created by simulation, we found that the FindPart-w algorithm can identify groups of lineages having the same relative effective reproduction number in a practical computational time. Applying FindPart-w to actual real-world data of time-stamped lineage counts from the United States, we found that the Pango lineage nomenclature system may have given different lineage names to SARS-CoV-2 strains even if they have the same relative effective reproduction number and similar genetic backgrounds. In conclusion, this study showed that viruses that had the same relative effective reproduction number were identifiable from temporal count data of viral sequences. These findings will contribute to the future development of lineage designation systems that consider both genetic backgrounds and transmissibilities of lineages.

14
Differential effects of fenofibrate and fenofibric acid on the regulation of liver endothelial permeability

Luty, M. T.; Borah, D.; Szafranska, K.; Giergiel, M.; Trzos, K.; McCourt, P.; Lekka, M.; Kotlinowski, J.; Zapotoczny, B.

2026-04-20 cell biology 10.64898/2026.04.16.718907 medRxiv
Top 1%
1.4%

Background and Aims: Fenofibrate is widely prescribed for hyperlipidaemia and has been associated with rare but severe cases of drug-induced liver injury (DILI), yet its effects on liver sinusoidal endothelial cells (LSECs) remain to be investigated. LSECs maintain a highly permeable specialized sinusoidal barrier characterized by transcellular pores (fenestrations), regulating the bidirectional transfer of circulating compounds to and from the hepatocytes. As drug-induced alterations in fenestration architecture could influence xenobiotic access to hepatocytes, these changes may modulate pathways associated with DILI. Understanding the effects of fenofibrate on LSEC ultrastructure may therefore provide insights into previously underexplored endothelial contributions to hepatic drug responses. Methods: Both fenofibrate and its active metabolite, fenofibric acid, were evaluated for their effects on LSEC ultrastructure, mechanical properties, and functional markers. Atomic force microscopy (AFM) and scanning electron microscopy (SEM) were used to quantify fenestration architecture. AFM was additionally used to measure cellular mechanical properties, which were interpreted in the context of fluorescence-based quantification of cytoskeletal organization. Gene expression, viability, and cytotoxicity were assessed using PCR-based and biochemical assays. Results: Fenofibrate reduced fenestration number and porosity at both tested concentrations (10 and 25 µM). It also decreased the apparent Young's modulus of LSECs, accompanied by changes in tubulin and actin architecture, without detectable cytotoxicity. In contrast, treatment with fenofibric acid did not result in significant structural or mechanical effects on LSECs, even at higher concentrations.
Conclusions: Together, these data identify LSECs as a drug-responsive hepatic cell type for fenofibrate, suggesting that LSECs could represent an underrecognized contributor to the complex, multifactorial processes underlying DILI. This work provides a framework for evaluating endothelial contributions to fenofibrate-associated liver effects in more complex models. Graphical abstract: Fenofibrate reduces LSEC fenestrations and metabolic activity at higher concentrations, while its metabolite, fenofibric acid, does not affect LSECs, regardless of its concentration.

15
Analysis of a detoxified Escherichia coli strain for bacteriophage production

Welham, E.; Park de la Torriente, A.; Arng Lee, J.; Keith, M.; McAteer, S. P.; Paterson, G. K.; Gally, D. L.; Low, A. S.

2026-04-21 microbiology 10.64898/2026.04.21.719556 medRxiv
Top 2%
1.1%

Phage therapeutics are re-emerging as adjuncts or alternatives to antibiotics and their clinical translation will be enhanced with production methods that minimise downstream processing. We evaluated whether an endotoxin-reduced E. coli strain developed for production of recombinant proteins, ClearColi®, can serve as a useful, safe phage production host without compromising yield and whether targeted receptor complementation can expand its utility. The parent strain BL21(DE3), and its lipid A modified derivative, ClearColi®, were compared with respect to infection and generation of phage. Across a panel of 31 phage, a similar host range was observed between BL21(DE3) and ClearColi®. To expand host range, ompC was genetically engineered into the chromosome of ClearColi®, thereby adding OmpC-dependent phage to its production capacity. Production metrics were broadly comparable between the hosts; efficiency of plating and final titres for representative phage were not significantly different; burst size varied by phage but without consistent host bias. Endotoxin activity in ClearColi®-propagated lysates was reduced by over 1000-fold relative to BL21(DE3), reaching the low hundreds of endotoxin units (EU) versus hundreds of thousands for BL21(DE3). Intravesical administration of ClearColi®-derived phage (LUC4) into pigs elicited no clinical abnormalities and no significant increases in circulating cytokines up to 48 hours after administration. ClearColi® allows efficient production of diverse phage with low endotoxin, reducing the requirement for downstream processing. Although its minimal LPS reduces its capacity for producing some LPS-dependent phage and its growth is slower than BL21(DE3), requiring optimisation for maximal phage titre, the safety and simplified manufacturing process support further development of endotoxin modified strains for phage production.
Impact statement: Antibiotic resistance is a current global problem, and treatments based on phage and phage products already have a proven track record with particular bacterial infections, especially in the urinary tract. While progress is being made on in vitro phage synthesis, large-scale bacteriophage preparations require a bacterial host for production; consequently, toxic components in the initial lysate need to be removed or significantly diluted for safe clinical use. This is a study of the potential to utilise an endotoxin-reduced E. coli strain, ClearColi(R), to produce safer phage therapeutics. Such endotoxin-modified strains should minimise the processing steps required and reduce overall production costs of a phage preparation. The research demonstrates that the endotoxin-reduced strain was able to produce a wide range of phage and, for the examples studied, at phage titres equivalent to the more toxic parent strain. We also show that the strain can be modified to increase its host range and confirm the very low endotoxicity of basic phage lysates produced by the strain. Replicating this process to engineer additional low-toxicity bacterial production strains will accelerate the development of safer, more cost-effective phage therapeutics.

16
Translation, Validation, and Application of Indonesian Genetic Literacy Questionnaires for Medical Students

Kemal, R. A.; Dhani, R.; Simanjuntak, A. M.; Rafles, A. I.; Triani, H. X.; Rahmi, T. M.; Akbar, V. A.; Firdaus, F.; Pratama, B. F.; Zulharman, Z.

2026-04-25 medical education 10.64898/2026.04.17.26350524 medRxiv
Top 2%
1.0%

Background: The increasing relevance of genetics and molecular biology in medicine necessitates greater genetic literacy among healthcare workers. To assess the literacy level, a validated genetic literacy questionnaire is needed. Therefore, a standardised Indonesian-language genetic literacy questionnaire is essential. Aims: We aimed to translate and validate three genetic literacy questionnaires (PUGGS, iGLAS, and UNC-GKS) for use among Indonesian medical students. We then evaluated genetic literacy levels using one of the validated questionnaires. Methods: The PUGGS, iGLAS, and UNC-GKS questionnaires were translated into Indonesian and then reviewed by an expert panel for translational accuracy and conceptual appropriateness. Back-translation was performed to confirm validity. Initial Indonesian versions of the questionnaires underwent cognitive pre-testing with 12 undergraduate medical students. After refinements, the questionnaires were validated among 34 first- to third-year medical students. The Indonesian version of the UNC-GKS questionnaire was then used to assess the genetic literacy of 486 medical students, comprising 228 preclinical medical students, 187 clerkship students, and 71 residents. Results: The Indonesian versions of PUGGS (Cronbach's α = 0.819) and UNC-GKS (α = 0.809) demonstrated good reliability, while iGLAS showed poor reliability (α = 0.315). Among the 486 students tested, 56% demonstrated moderate overall genetic literacy, and only 15.2% demonstrated good overall literacy. Basic genetic concepts were relatively well understood, with 54.3% having good literacy. In contrast, the effects of gene variants on health were poorly understood, with only 9.7% having good literacy. Inheritance concepts were moderately understood, with 24.9% having good literacy. Conclusion: The Indonesian translations of PUGGS and UNC-GKS are reliable tools for assessing genetic literacy among medical students. Using UNC-GKS, we observed predominantly moderate genetic literacy levels.
Curriculum improvement to better integrate genetics education is essential to support its clinical applications.
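
The abstract above reports reliability as Cronbach's alpha but does not say how it was computed. As a minimal sketch, assuming the standard textbook formula (alpha = k/(k-1) × (1 − sum of item variances / variance of total scores)) and sample variances, the calculation could look like the following. The function name and the input layout are illustrative assumptions, not the authors' code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score table.

    scores: list of rows, one row per respondent, each row a list of
    k item scores (hypothetical layout chosen for this sketch).
    """
    n_respondents = len(scores)
    k = len(scores[0])
    if k < 2 or n_respondents < 2:
        raise ValueError("need at least 2 items and 2 respondents")

    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]

    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Three respondents, three perfectly consistent items.
    print(cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]]))
```

When every item ranks respondents identically, as in the toy data, alpha reaches its maximum of 1; values such as the 0.315 reported for iGLAS indicate that the items vary largely independently of one another.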

17
Quantitative Assessment of Dual and Triple Energy Window Scatter Correction in Myocardial Perfusion SPECT with a 4D Phantom

El Bab, M.; Guvenis, A.

2026-04-25 cardiovascular medicine 10.64898/2026.04.17.26351095 medRxiv
Top 2%
1.0%

Conflicting evidence on scatter correction (SC) methods plagues quantitative myocardial perfusion SPECT (MPI), hindering standardized clinical protocols. This simulation study, utilizing the SIMIND Monte Carlo program and a highly realistic 4D XCAT phantom, systematically evaluates Dual Energy Window (DEW, with k=0.5) and Triple Energy Window (TEW) SC techniques. We uniquely investigate their performance across various photopeak window widths (2, 4, and 6 keV) and novel overlapped/non-overlapped configurations specifically for Tc-99m MPI, parameters largely unexplored in realistic cardiac models. Images were reconstructed with OSEM under uncorrected (UC), SC, and combined attenuation- and scatter-corrected (ACSC) conditions. Quantitative analysis focused on signal-to-noise ratio (SNR), contrast-to-noise ratio (CNR), defect contrast, and relative noise to background (RNB). Our findings consistently show ACSC's superior performance in CNR, SNR, and defect contrast, confirming its critical role. Interestingly, SC alone reduced noise but compromised defect contrast relative to UC, highlighting a potential trade-off without attenuation correction. Crucially, this study reveals minimal influence of photopeak window width and overlap configuration on image quality, and no significant difference between DEW and TEW across most metrics. These results provide essential evidence for optimizing quantitative MPI protocols, suggesting that for Tc-99m, the choice between DEW and TEW, and specific window settings, may be less critical than ensuring robust attenuation correction.
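
The abstract names the DEW and TEW methods without giving their formulas. The sketch below uses the commonly cited forms, which may differ in detail from what the study implemented: DEW subtracts k times the counts in a lower scatter window from the photopeak counts (k=0.5 is taken from the abstract), and TEW estimates scatter as a trapezoid spanning two narrow sub-windows on either side of the photopeak. The function names, the per-pixel scalar interface, the clipping at zero, and the window widths in the usage lines are all assumptions made for illustration.

```python
def dew_primary(photopeak_counts, scatter_counts, k=0.5):
    """Dual Energy Window estimate of primary (unscattered) counts.

    Subtracts k times the lower scatter-window counts from the
    photopeak counts. Negative results are clipped to zero, which is
    a choice made for this sketch.
    """
    return max(photopeak_counts - k * scatter_counts, 0.0)


def tew_primary(peak_counts, lower_counts, upper_counts,
                peak_width_kev, lower_width_kev, upper_width_kev):
    """Triple Energy Window estimate of primary counts.

    Scatter under the photopeak is approximated by a trapezoid whose
    two heights are the count densities (counts per keV) in the lower
    and upper sub-windows.
    """
    scatter = (lower_counts / lower_width_kev
               + upper_counts / upper_width_kev) * peak_width_kev / 2.0
    return max(peak_counts - scatter, 0.0)


if __name__ == "__main__":
    # Hypothetical single-pixel counts and window widths.
    print(dew_primary(100.0, 40.0))
    print(tew_primary(100.0, 6.0, 2.0, 28.0, 4.0, 4.0))
```

Both estimators operate pixel by pixel on the projection data before reconstruction, so either can be applied to an image array by mapping the function over corresponding pixels of the window images.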

18
Nanopore Whole-Genome Sequencing for Rapid, Comprehensive Molecular Diagnostics of Brain Tumors in Adult Patients

Halldorsson, S.; Nagymihaly, R. M.; Bope, C. D.; Lund-Iversen, M.; Niehusmann, P.; Lien-Dahl, T.; Pahnke, J.; Bruning, T.; Kongelf, G.; Patel, A.; Sahm, F.; Euskirchen, P.; Leske, H.; Vik-Mo, E. O.

2026-04-24 pathology 10.64898/2026.04.23.26351563 medRxiv
Top 2%
0.9%

Background: Classification of central nervous system (CNS) tumors has become increasingly complex, raising concerns about the sustainability of comprehensive molecular diagnostics. We have evaluated nanopore whole genome sequencing (nWGS) as a single workflow to replace multiple diagnostic assays. Methods: We performed nWGS on DNA extracted from 90 adult CNS tumor samples (58 retrospective, 32 prospective) and compared the results to findings from standard-of-care (SoC) diagnostic work-up. Analysis was done through an automated workflow that consolidated diagnostically and therapeutically relevant genomic alterations, including copy-number variation, structural, and single-nucleotide variants, chromosomal aberrations, gene fusions, and methylation-based classification. Results: nWGS supported final diagnostic classification in all samples with >15% tumor cell content, requiring ~3 hours of hands-on library preparation, parallel sample processing, and sequencing times within 72 hours. Methylation-based classification was available within 1 hour and was concordant with the integrated final diagnosis in 89% of cases (80/90). All diagnostically relevant copy-number variations, single-nucleotide variants, and gene fusions were concordant with SoC testing. MGMT promoter methylation status matched in 94% of cases. In addition, nWGS identified prognostic and potentially actionable variants that were not reported or covered by SoC. Conclusions: nWGS delivers comprehensive genetic and epigenetic results with a fast turn-around compared to standard methods. This enables efficient, accurate, and scalable molecular diagnostics of CNS tumors using a single platform. These data support its implementation in routine clinical practice, and the approach may be extended to other cancer types requiring complex genomic profiling.

19
Chinese Herbal Medicine as a complementary therapy for the management of Colorectal Cancer: Study protocol for a Delphi Expert Consensus survey

Ng, C. Y.; Liu, M.; Ai, D.; Yao, L.; Yang, M.; Zhong, L. L.

2026-04-22 oncology 10.64898/2026.04.21.26350990 medRxiv
Top 2%
0.9%

Introduction: Colorectal cancer (CRC) remains a leading cause of cancer-related morbidity and mortality worldwide, despite advances in conventional oncological therapies. In recent years, various studies have made advances in integrative oncology, such as investigating the use of Chinese Herbal Medicine (CHM) as a complementary therapy alongside conventional oncological therapies to alleviate treatment-related adverse effects, improve quality of life, and potentially enhance therapeutic outcomes. Despite this, clinical practice in this area remains highly heterogeneous, with limited standardized guidelines on key areas of concern such as (1) optimal intervention, (2) recommended stage and duration of intervention, (3) safety considerations, and (4) possible herb-drug interactions. Hence, this study aims to establish expert consensus on the usage of CHM as a complementary therapy in the management of CRC, to support safe, consistent, and evidence-informed clinical practice. Methods and Analysis: We will employ a modified Delphi technique to achieve consensus amongst a panel of international experts in various fields related to integrative oncology. Prior to the study, a list of questionnaire items was developed based on a systematic review of existing clinical practice guidelines on CRC. An international panel will be invited based on established international profile in integrative oncology research and clinical practice, and by peer referral. Two rounds of Delphi will be conducted using anonymous online questionnaires. Consensus will be considered reached if at least 50% of the panel strongly agree/disagree that an item should be included or excluded, while strong consensus will be set at 76%. Items which achieve strong consensus after Round 1 will be removed, before being sent out for Round 2 with a summary of Round 1 responses for a final consensus.
Ethics and Dissemination: Ethics approval has been obtained from the Institutional Review Board of Nanyang Technological University (IRB-2025-1222). Our findings will be disseminated through peer-reviewed publications and conference presentations.
Strengths and limitations of this study:
- This study will develop an expert consensus which aims to guide future integration of Chinese Herbal Medicine (CHM) as a complementary therapy into colorectal cancer (CRC) management.
- Key concerns that limit implementation in clinical practice have been identified in areas such as determining the (1) optimal intervention, (2) recommended stage and duration of intervention, (3) safety considerations, and (4) possible herb-drug interactions, thereby laying the groundwork for potential future incorporation of CHM into CRC treatment protocols alongside conventional oncology approaches.
- Designing a study e-guide, followed by conducting the consensus rounds online, will facilitate participants' responses and the dissemination of information from previous rounds.
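
The consensus rule stated in this protocol's Methods and Analysis (at least 50% of the panel strongly agreeing or strongly disagreeing, with strong consensus set at 76%) can be illustrated with a short sketch. The response labels, the function name, and the reading that "strongly agree" and "strongly disagree" are tallied separately and that 76% is an inclusive lower bound are all assumptions; the protocol text does not specify them.

```python
CONSENSUS_THRESHOLD = 0.50         # from the protocol text
STRONG_CONSENSUS_THRESHOLD = 0.76  # from the protocol text; ">=" is assumed


def classify_item(responses):
    """Classify one questionnaire item from a list of panel responses.

    responses: list of strings such as "strongly agree", "agree",
    "neutral", "disagree", "strongly disagree" (hypothetical labels).
    """
    if not responses:
        raise ValueError("no responses for this item")

    n = len(responses)
    share_agree = sum(r == "strongly agree" for r in responses) / n
    share_disagree = sum(r == "strongly disagree" for r in responses) / n

    # Use whichever strong position has the larger share of the panel.
    top_share = max(share_agree, share_disagree)

    if top_share >= STRONG_CONSENSUS_THRESHOLD:
        return "strong consensus"
    if top_share >= CONSENSUS_THRESHOLD:
        return "consensus"
    return "no consensus"


if __name__ == "__main__":
    panel = ["strongly agree"] * 8 + ["agree"] * 2
    print(classify_item(panel))
```

Under the protocol as described, items classified as strong consensus after Round 1 would be removed before Round 2; the remaining items would be recirculated with a summary of Round 1 responses.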

20
Practical quantification of immunohistochemistry antigen concentrations and reaction-diffusion parameters

Peale, F. V.; Perng, W.; Mbiribindi, B.; Andrews, B. T.; Wang, X.; Dunlap, D.; Eastham, J.; Ngu, H.; Chernyshev, A.; Orlova, D.

2026-04-21 pathology 10.64898/2026.04.16.719078 medRxiv
Top 2%
0.9%

The immunohistochemistry (IHC) methods widely used in diagnostic medicine and biomedical research are kinetically complex reaction-diffusion processes that, ideally, produce stain intensities correlated with the local antigen concentration. Yet after 75 years of use, practical theoretical tools to rigorously plan and interpret IHC experiments are still lacking. Because modeling the reactions requires time-consuming computer simulation, impractical for regular use, most protocols are optimized empirically, without detailed knowledge of the reaction rates and antigen-antibody equilibria. The resulting stain intensities can be calibrated against standards with known antigen abundance, but they are typically not interpretable in terms of chemical antigen concentrations. To address these limitations, we developed a fast interpolation method to model reaction-diffusion behavior, and experimental methods to characterize IHC kinetic parameters in formalin-fixed paraffin-embedded (FFPE) samples. Used together, these allow experimental measurement of both the chemical concentration of antigen in the sample and the reaction-diffusion parameters consistent with the assay results. Results show 1) direct immunofluorescent detection has low nanomolar sensitivity with >1000-fold dynamic range, and 2) antibody diffusion rates in FFPE samples can be >1000-fold slower than in aqueous solutions, producing diffusion-limited conditions in which the IHC reaction time course may depend on the sample antigen concentration. Awareness of these details is necessary to avoid potential underestimation of both the absolute and relative antigen concentrations in different samples that may occur if staining is stopped before reaching equilibrium. Software tools are provided to allow users to rapidly model IHC reaction time courses and to fit experimental time course data with candidate reaction parameters. 
The principles described here apply equally to other tissue-based "spatial omics" analyses and should be considered when designing and interpreting experiments requiring any macromolecule to diffuse into and react in a tissue section. SIGNIFICANCE: The theoretical and experimental framework described here advances IHC staining from a qualitative or semi-quantitative method towards a more rigorously quantitative assay. The practical ability to predict IHC reaction kinetics and fit reaction parameters to experimental data has the potential to advance IHC applications in diagnostic medicine and biomedical research in three ways: 1) interpretation of experimental and diagnostic samples stained under different conditions can be more objective, facilitating comparison of results from different protocols and different laboratories; 2) IHC staining can be interpreted as molar chemical antigen-antibody concentrations calculated from the reaction parameters measured in the studied sample; 3) the correlation between antigen concentration and biological behavior can be examined more reliably. Practical software tools are provided.